fix: handle Anthropic-style token reporting in cached billing #5
Conversation
CalculateCost now accepts CacheUsage and bills cached tokens at the CacheRead rate instead of the full Input rate. This correctly reflects the cost savings from prompt caching (typically 50-90% cheaper).

Changes:
- BillingResult: added CachedTokens and CachedInputCost fields
- CalculateCost: accepts *CacheUsage, splits cached/non-cached pricing
- BillingInterceptor: extracts cache_usage from ResponseMetadata.Custom and passes it to CalculateCost
Tests cover:
- No cache usage (nil) — all tokens at full rate
- OpenAI cache hit (CachedTokens field)
- Anthropic cache hit (CacheReadInputTokens field)
- Cache usage present but zero tokens — treated as no caching
- Cached tokens exceeding prompt tokens — clamped
- No CacheRead price — falls back to full input rate
- All tokens cached — zero non-cached cost
- Zero tokens — zero cost
- Mixed provider cache fields (summed)
- Interceptor extracting cache_usage from ResponseMetadata.Custom
- Interceptor with nil/empty Custom map
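The description and test list above can be sketched in Go. This is a minimal illustration, not the repository's actual code: the type and field names (CacheUsage, Pricing, BillingResult) are assumptions modeled on the PR description, and per-token prices are whole numbers to keep the arithmetic exact.

```go
package main

import "fmt"

// Hypothetical shapes based on the PR description.
type CacheUsage struct {
	CachedTokens         int // OpenAI-style field
	CacheReadInputTokens int // Anthropic-style field
}

type Pricing struct {
	Input     float64 // per-token price, non-cached
	CacheRead float64 // per-token price, cached (0 = no cache price)
}

type BillingResult struct {
	CachedTokens    int
	CachedInputCost float64
	InputCost       float64
}

// CalculateCost bills cached tokens at the CacheRead rate and the
// remainder at the full Input rate.
func CalculateCost(promptTokens int, p Pricing, cu *CacheUsage) BillingResult {
	cached := 0
	if cu != nil {
		// Mixed provider cache fields are summed.
		cached = cu.CachedTokens + cu.CacheReadInputTokens
	}
	if cached > promptTokens {
		cached = promptTokens // clamp cached tokens to the prompt count
	}
	rate := p.CacheRead
	if rate == 0 {
		rate = p.Input // no CacheRead price: fall back to full input rate
	}
	return BillingResult{
		CachedTokens:    cached,
		CachedInputCost: float64(cached) * rate,
		InputCost:       float64(promptTokens-cached) * p.Input,
	}
}

func main() {
	// 1000 prompt tokens, 800 of them served from cache.
	r := CalculateCost(1000, Pricing{Input: 2, CacheRead: 1}, &CacheUsage{CachedTokens: 800})
	fmt.Println(r.CachedTokens, r.CachedInputCost, r.InputCost) // 800 800 400
}
```

Note that the clamp reflects the behavior at this point in the PR; the follow-up commit replaces it with reporting-style detection for Anthropic.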
…es cached

Anthropic reports input_tokens as non-cached only, while OpenAI includes cached tokens in prompt_tokens. CalculateCost now detects the style by comparing the two: if cachedTokens > promptTokens, the provider is reporting non-cached tokens only, and promptTokens is adjusted to reflect the true total. Fixes incorrect billing where Anthropic cache hits were clamped to the small input_tokens count instead of reflecting the full cached amount.
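The comparison described above can be sketched as a small normalization step. The function name and signature here are illustrative, not the repository's actual identifiers:

```go
package main

import "fmt"

// normalizeTokens returns the true prompt total and the non-cached
// count, given the raw prompt and cached token counts a provider reports.
func normalizeTokens(promptTokens, cachedTokens int) (total, nonCached int) {
	if cachedTokens <= promptTokens {
		// OpenAI style: prompt_tokens already includes cached tokens.
		return promptTokens, promptTokens - cachedTokens
	}
	// Anthropic style: input_tokens counts non-cached tokens only,
	// so the true total is prompt + cached.
	return promptTokens + cachedTokens, promptTokens
}

func main() {
	total, non := normalizeTokens(3, 2022) // Anthropic values from the PR
	fmt.Println(total, non)                // 2025 3
	total, non = normalizeTokens(1000, 800) // OpenAI-style cache hit
	fmt.Println(total, non)                 // 1000 200
}
```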
Summary
Fixes a billing bug where Anthropic cache hits were incorrectly billed. Anthropic reports input_tokens as non-cached only, while OpenAI includes cached tokens in prompt_tokens. The previous code clamped cached tokens to prompt tokens, which collapsed 2022 cached tokens down to 3.

The Bug
Fix
CalculateCost now detects the reporting style:
- cachedTokens <= promptTokens → OpenAI style (prompt includes cached) → nonCached = prompt - cached
- cachedTokens > promptTokens → Anthropic style (prompt is non-cached only) → nonCached = prompt, adjust total

Tests Updated
- TestCalculateCost_WithAnthropicCacheHit — real-world values (input_tokens=3, cache_read=2022)
- TestCalculateCost_CacheExceedsPromptTokens_AnthropicStyle — verifies PromptTokens adjusted to true total
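The values those two tests pin down can be checked with a self-contained sketch; the helper name `adjust` and the table shape are illustrative, not the repository's actual test code:

```go
package main

import "fmt"

// adjust mirrors the detection behavior the updated tests verify.
func adjust(prompt, cached int) (total, nonCached int) {
	if cached > prompt { // Anthropic style: prompt is non-cached only
		return prompt + cached, prompt
	}
	return prompt, prompt - cached // OpenAI style: prompt includes cached
}

func main() {
	cases := []struct {
		name                                    string
		prompt, cached, wantTotal, wantNonCached int
	}{
		{"anthropic real-world", 3, 2022, 2025, 3},
		{"openai cache hit", 1000, 800, 1000, 200},
	}
	for _, c := range cases {
		total, non := adjust(c.prompt, c.cached)
		fmt.Println(c.name, total == c.wantTotal && non == c.wantNonCached)
	}
}
```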